Random graph models for directed acyclic networks
We study random graph models for directed acyclic graphs, an important class of networks that
includes citation networks, food webs, and feed-forward neural networks among others. We propose
two specific models, roughly analogous to the fixed edge number and fixed edge probability variants
of traditional undirected random graphs. We calculate a number of properties of these models,
including particularly the probability of connection between a given pair of vertices, and compare
the results with real-world acyclic network data, finding that theory and measurements agree
surprisingly well, far better than the often poor agreement of other random graph models with their
corresponding real-world networks.



I. INTRODUCTION 

A directed acyclic graph is a directed graph with no cycles, that is,
no closed paths across the graph that start and end at the same vertex
and follow edges only in their forward direction. Directed acyclic
graphs are a fundamental class of networks that occur widely in natural
and man-made settings. The best-studied examples are citation networks,
networks in which the vertices represent documents and the directed
edges represent citations between them. Citation networks of learned
papers have long been an object of study in the information sciences
[1], [2], [3] and more recently in physics. Citation networks of patents
and legal cases have also received some attention in the last few years.
Directed acyclic graphs occur in many other areas too. In biology,
phylogenetic networks representing gene transfer are strictly acyclic
and food webs are approximately so. In computer science and engineering
acyclic or approximately acyclic graphs occur in data structures,
software call graphs, and feed-forward neural networks. In pure
mathematics acyclic graphs are studied for their own sake [9], [10],
[11] and as a representation of partially ordered sets [12] and random
graph orders, while in statistics the widely used Bayesian networks are
an acyclic graph version of probabilistic graphical models.

Over the years, the study of networks has been substantially
illuminated by the development of random graph models. Such models
include the original (Poisson) random graph famously studied by Erdős
and Rényi [19], [20], the configuration model of Molloy and Reed and
others [21], [22], [23], [24] and its generalizations to directed,
bipartite, and other network types [25], [26], the small-world model of
Watts and Strogatz [27], exponential random graphs [28], [29], and
others. These models, combining simple definitions with complex but
still analytically accessible structures, have provided an invaluable
window on the expected behavior of large networks, as well as serving as
the starting point for many other more sophisticated models and
calculations.

To the best of our knowledge, however, no corresponding model has been
studied for directed acyclic graphs: there is no equivalent of the
configuration model for networks such as citation networks or food webs.
In this paper, we propose such a model and study its properties in
detail, giving derivations of a variety of quantities of interest,
extensive numerical simulations, and comparisons with the behavior of
real-world acyclic graphs, with which, in some cases, the model appears
to be in surprisingly good agreement. A brief report of some of the
material in this paper has appeared previously as Ref. [30].



II. ACYCLIC GRAPHS AND ORDERED 
GRAPHS 

To correctly specify a random graph model for directed 
acyclic graphs it is crucial first to understand the reason 
why such graphs are acyclic in real life. In most prac- 
tical examples the acyclic nature of the network arises 
because the vertices are ordered. In citation networks 
and phylogenetic networks, for example, the vertices are 
time ordered: academic papers have a date or time of 
publication; species have a time of origination or spe- 
ciation. In food webs vertices are ordered according to 
trophic level. (Trophic level, however, is often only an
approximate concept and not precisely defined, which is 
why some food webs are only approximately acyclic, con- 
taining a few violations of the no-loops condition.) In 
software call graphs, the vertices, representing functions 
or subroutines, are ordered according to the software ab- 
straction layer they occupy, and so forth. 

In each of these cases, it is the ordering of the ver- 
tices and not their acyclic structure that is the defini- 
tive property of the network. The acyclic structure is 
merely a corollary of the ordering. In citation networks, 
for instance, papers can only cite others that came before 
them in time, and this eliminates closed cycles because 
all paths in the network must lead backward in time and 
there are no forward paths available to close the cycle. 
Similarly in food webs species of higher trophic level prey 
on those of lower level. In software graphs functions at 
higher levels of abstraction call those at lower levels. The 
name "directed acyclic graph" is thus perhaps slightly 
misleading, focusing our attention, as it does, on the
acyclic property rather than the more fundamental ordering.
A better name might be "directed ordered graphs,"
but unfortunately the literature on this topic has long 
ago settled on the older name and it seems unwise to try 
and change it now. 

What is important for our purposes, however, is that a 
sensible random graph model for these networks should 
mirror the features seen in the real world and incorporate 
an underlying ordering of the vertices that then drives 
the acyclic structure. Thus the correct model is really 
a "random ordered graph" and this is the approach we 
take in this paper [31].



III. RANDOM DIRECTED ACYCLIC GRAPHS 
WITH FIXED DEGREE SEQUENCES 

In this paper we propose two related random graph models of directed
acyclic graphs. The two models are roughly analogous to the well known
$G(n,m)$ and $G(n,p)$ versions of the standard Poisson random graph
[19], one fixing the number of edges in the network exactly and the
other fixing only the expected number. We begin by describing the
"$G(n,m)$" version, which we introduced previously in Ref. [30]. The
"$G(n,p)$" version, which is introduced for the first time in this
paper, is described in Section IV.

Our first model takes as its input an ordered degree sequence
consisting of the in-degree $k_i^{\mathrm{in}}$ and out-degree
$k_i^{\mathrm{out}}$ for each vertex $i = 1 \ldots n$, where $n$ is the
total number of vertices in the network. The directed edges in the model
are allowed to run only from vertices with higher indices to vertices
with lower, and this constraint enforces the acyclic nature of the
network. Thus we can have an edge running to vertex $i$ from vertex $j$
only if $i < j$.

Throughout this paper we describe our networks in 
the language of time ordering: vertices are "earlier" or 
"later" in the network, meaning they have lower or higher 
indices, and the vertices with the lowest and highest in- 
dices are referred to as "first" and "last." The use of 
these terms is purely for convenience and should not be 
taken as restricting the model to networks in which the 
vertices are time ordered. The concepts we introduce can 
be applied equally to networks such as food webs and call 
graphs in which the ordering has nothing to do with time. 



A. Graphical degree sequences

A first important point to notice is that not all degree sequences are
realizable as ordered acyclic graphs of the type described here. By
analogy with similar issues in other branches of graph theory, we will
refer to realizable degree sequences as graphical.

As with all directed graphs, if a degree sequence is to be graphical the
sum of the in-degrees of all vertices must equal the sum of the
out-degrees, since every edge that starts somewhere ends somewhere. Both
sums are also individually equal to the total number $m$ of edges in the
network:

$$\sum_{i=1}^{n} k_i^{\mathrm{in}} = \sum_{i=1}^{n} k_i^{\mathrm{out}} = m. \tag{1}$$

For a directed acyclic graph, however, there are also additional
conditions. For instance, the first ($i = 1$) vertex in the graph can
never have any outgoing edges, since there are no earlier vertices for
such edges to attach to. Thus $k_1^{\mathrm{out}} = 0$ always in a
graphical degree sequence. Similarly $k_n^{\mathrm{in}} = 0$. More
generally, we can derive a condition on the out-degree of every vertex
as follows.

It is helpful to visualize in- and out-degrees as sets of "stubs" of
edges pointing in and out of each vertex in the appropriate numbers. To
create a complete network we need to match the stubs in pairs, out with
in, to make whole edges, and a degree sequence is graphical only if all
stubs can be matched while respecting the ordering of the vertices.

The number of stubs outgoing from vertices below vertex $i$ is
$\sum_{j=1}^{i-1} k_j^{\mathrm{out}}$ and each such stub must be matched
with an ingoing stub at a vertex below $i$, of which there are
$\sum_{j=1}^{i-1} k_j^{\mathrm{in}}$. The number of ingoing stubs below
$i$ that are left over after we do this matching is

$$\mu_i = \sum_{j=1}^{i-1} k_j^{\mathrm{in}} - \sum_{j=1}^{i-1} k_j^{\mathrm{out}}. \tag{2}$$

This is the number of ingoing stubs below vertex $i$ that are available
to attach to outgoing stubs at $i$ and above. Note that this number is
determined entirely by the degree sequence: it does not depend on any of
the details of which vertices are connected to which others.

Now consider vertex $i$ itself. Its out-degree $k_i^{\mathrm{out}}$ is
the number of its outgoing stubs, and each of those stubs must be
matched with an ingoing one below $i$. That means that
$k_i^{\mathrm{out}}$ cannot be greater than $\mu_i$ as defined above. If
it were, then there would not be enough in-stubs available for $i$'s
out-stubs to attach to and the degree sequence would not be graphical.
Thus a necessary condition for a degree sequence to be graphical is

$$k_i^{\mathrm{out}} \le \sum_{j=1}^{i-1} k_j^{\mathrm{in}} - \sum_{j=1}^{i-1} k_j^{\mathrm{out}}. \tag{3}$$

For convenience, we define

$$\lambda_i = \sum_{j=1}^{i-1} k_j^{\mathrm{in}} - \sum_{j=1}^{i} k_j^{\mathrm{out}}, \tag{4}$$

so that Eq. (3) can be written as

$$\lambda_i \ge 0. \tag{5}$$

This condition must hold for all $i$ if the degree sequence is to be
graphical.

FIG. 1: The flux $\mu_i$ is equal to the number of edges from
vertices $i$ and above that connect to vertices below $i$. The
excess flux $\lambda_i$ is the number of edges that go around vertex $i$,
connecting vertices above to vertices below without passing
through $i$. In this example $\mu_i = 6$ and $\lambda_i = 5$.

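Our earlier condition that $k_1^{\mathrm{out}} = 0$ trivially implies
that $\lambda_1 = 0$, and $k_n^{\mathrm{in}} = 0$ implies that
$\lambda_n = 0$ because

$$\lambda_n = \sum_{j=1}^{n-1} k_j^{\mathrm{in}} - \sum_{j=1}^{n} k_j^{\mathrm{out}} = (m - k_n^{\mathrm{in}}) - m = 0, \tag{6}$$

where we have made use of Eq. (1). Thus we also have

$$\lambda_1 = \lambda_n = 0. \tag{7}$$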

One might imagine that one could now make a similar argument about the
in-degrees of each vertex and derive a second condition for graphical
sequences of the form:

$$\sum_{j=i+1}^{n} k_j^{\mathrm{out}} - \sum_{j=i}^{n} k_j^{\mathrm{in}} \ge 0. \tag{8}$$

This is correct, but in fact it is just another form of the first
condition, Eq. (5), as the reader can easily verify by applying Eq. (1).
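The verification left to the reader can be written out in one line. The following is only a sketch of that step, using Eq. (1) to replace each sum over later vertices by $m$ minus the complementary sum over earlier vertices, and then the definition in Eq. (4):

$$\sum_{j=i+1}^{n} k_j^{\mathrm{out}} - \sum_{j=i}^{n} k_j^{\mathrm{in}}
= \Bigl(m - \sum_{j=1}^{i} k_j^{\mathrm{out}}\Bigr) - \Bigl(m - \sum_{j=1}^{i-1} k_j^{\mathrm{in}}\Bigr)
= \sum_{j=1}^{i-1} k_j^{\mathrm{in}} - \sum_{j=1}^{i} k_j^{\mathrm{out}}
= \lambda_i .$$

The left-hand side of Eq. (8) is therefore identical to the excess flux of Eq. (4), so requiring it to be nonnegative adds nothing new.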

Equations (5) and (7) are a necessary condition for the degree sequence
to be graphical. It is straightforward to show that they are also
sufficient. The proof is a constructive one: we build a network starting
from the first vertex and working up. If Eq. (5) holds then at each
vertex $i$ we know that the number of free in-stubs at lower vertices is
at least $k_i^{\mathrm{out}}$, and hence there are in-stubs available to
attach all of our out-stubs to. If we simply choose between the
available stubs in any way we like, create the appropriate edges, and
move on to the next vertex, then so long as there are no unused in-stubs
left when we get to the last vertex, which is guaranteed by Eq. (7), we
will have built a complete graph and hence the sequence is graphical.

Thus Eqs. (5) and (7) are a necessary and sufficient condition for a
graphical degree sequence.
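The test for a graphical sequence translates directly into a single pass over the degree sequence. The sketch below is illustrative only: the function names `fluxes` and `is_graphical` are ours, not the paper's, and the lists are 0-indexed, so position 0 holds the paper's vertex 1.

```python
def fluxes(k_in, k_out):
    """Flux mu_i (Eq. 2) and excess flux lambda_i (Eq. 4) for every vertex.

    Lists are 0-indexed here, so position 0 holds the paper's vertex 1.
    """
    mu, lam = [], []
    free_in = 0  # in-stubs at earlier vertices not claimed by earlier out-stubs
    for kin, kout in zip(k_in, k_out):
        mu.append(free_in)          # in-stubs available to this vertex and above
        lam.append(free_in - kout)  # what is left after this vertex takes its share
        free_in += kin - kout
    return mu, lam


def is_graphical(k_in, k_out):
    """Check Eq. (1), lambda_i >= 0 for all i (Eq. 5), and lambda_1 = lambda_n = 0 (Eq. 7)."""
    if len(k_in) != len(k_out) or not k_in:
        return False
    if sum(k_in) != sum(k_out):
        return False
    _, lam = fluxes(k_in, k_out)
    return min(lam) >= 0 and lam[0] == 0 and lam[-1] == 0
```

For the hypothetical four-vertex sequence with in-degrees (1, 2, 0, 0) and out-degrees (0, 0, 2, 1), `fluxes` gives fluxes (0, 1, 3, 1) and excess fluxes (0, 1, 1, 0), so the sequence is graphical.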

The quantities $\mu_i$ and $\lambda_i$ have a simple geometric
interpretation as shown in Fig. 1. If we make a cut in our graph between
vertices $i$ and $i-1$, the quantity $\mu_i$ is the number of edges that
cross the cut, or the number flowing from higher to lower vertices. For
this reason, we call $\mu_i$ the flux at vertex $i$. (Technically the
flux is a property not of the vertex but of the gap between vertices $i$
and $i-1$, but we have to give it a label so we choose to label it with
the upper of the two vertices.)

FIG. 2: Flux $\mu_i$ for the network of citations between legal
opinions of the US Supreme Court, plotted as a function of
year of publication. The three dotted lines highlight dips in
the flux and correspond roughly to three widely acknowledged
shifts in the legal philosophy of the court: the start and end of
the "Lochner era," during which the court took a strong
anti-regulatory stance, and the start of the Warren court. (Note
that the origin is suppressed on the vertical axis.)

The quantity $\lambda_i$ is equal to the number of edges that flow
"around" vertex $i$, meaning the number that run from vertices above $i$
to vertices below. We call this quantity the excess flux at vertex $i$.
Using the definitions in Eqs. (2) and (4), we can show that the flux and
excess flux are related by

$$\mu_i = \lambda_i + k_i^{\mathrm{out}} = \lambda_{i-1} + k_{i-1}^{\mathrm{in}}. \tag{9}$$

In the limit of large network size, as we will shortly see, the flux and
excess flux are equal to one another to within a fraction of order
$1/n$, and we will refer to both simply as "flux" in this limit.

The flux is a quantity of interest in its own right in real-world
networks. Low values of flux indicate "bottlenecks" in a network (lines
across which few edges flow) and high values indicate regions in which
there are many edges. Figure 2, for example, shows the measured flux as
a function of time for the network of citations between legal decisions
of the Supreme Court of the United States [8]. A number of dips in the
flux are visible in the figure (marked with dotted lines). In legal
terms, these dips correspond to temporal divisions between sets of
opinions such that the earlier set is little cited by the later set. It
is a reasonable guess that these divisions reflect changes in legal
thought that made older opinions obsolete, and indeed each of the three
dips highlighted in the figure corresponds to an acknowledged shift in
Supreme Court jurisprudence, as indicated.



B. Definition of the model 

The definition of our random graph model is now 
straightforward. In the language of "stubs" introduced 
above, a graph on a graphical degree sequence is created 
by matching in- and out-going stubs in pairs to create m 
complete edges while respecting the ordering of the ver- 
tices (meaning that out-stubs can connect only to earlier 
in-stubs). Our model is defined to be the ensemble of all 
such matchings in which every matching appears with 
equal probability. 

This definition is the exact equivalent for directed acyclic graphs of
the standard configuration model for undirected graphs [23]. In the
configuration model one matches undirected stubs in pairs to create
undirected edges and all matchings appear with equal probability in the
ensemble. Note that in our model, as in the configuration model,
multiedges are allowed. That is, the same pair of vertices can be
connected by more than one edge. (Unlike the configuration model, there
are no self-edges in an acyclic network, since this would violate the
no-cycles rule.) Multiedges occur in some real-world acyclic networks,
but not in others. In the model, however, they typically constitute a
small $O(1/n)$ fraction of all edges, and so are negligible in the large
system size limit. At the same time, a model that admits them is far
easier to study analytically than a model that does not.

Note also that the model includes random ordered 
trees — which have been widely studied in the past — as 
a special case. If every vertex in the network (other than 
the first) has out-degree 1 then the network is necessar- 
ily a tree and the ensemble is uniform over all ordered 
tree-like matchings with the given degrees. 

Although the model is simple and intuitive, there are, just as with the
configuration model, some subtleties to its definition. An important
point to notice is that matchings of stubs are not in one-to-one
correspondence with network topologies. Imagine our stubs to be labeled
somehow, with letters or numbers, so that each one is uniquely
identifiable. There will then, in general, be many different matchings
that correspond to each possible network topology. If we take a matching
and simply permute the labels of the out-stubs at a single vertex $i$,
we produce a new matching corresponding to the same topology. The number
of distinct such permutations is $k_i^{\mathrm{out}}!$. We can similarly
permute the in-stubs at vertex $i$ for a total of $k_i^{\mathrm{in}}!$
permutations, and the number of permutations of all stubs at all
vertices is then $\prod_i k_i^{\mathrm{in}}!\,k_i^{\mathrm{out}}!$.
This, in the simplest case, is the number of matchings that correspond
to each topology. Since this number is a function solely of the degree
sequence, it is the same for all topologies, and hence if all matchings
occur with equal probability $p$, then all topologies occur with equal
probability $p \prod_i k_i^{\mathrm{in}}!\,k_i^{\mathrm{out}}!$.

FIG. 3: Top: a small directed acyclic network with four vertices
and three edges. The stubs at each vertex are labeled
with letters, and the four versions of the graph show the
matchings of the stubs generated by permuting the stubs at
each vertex. Each permutation generates a different matching,
so there are in this case four matchings corresponding
to the same graph, as we would expect since the product
$\prod_i k_i^{\mathrm{in}}!\,k_i^{\mathrm{out}}! = 4$ in this case.
Bottom: a second graph with the same degree sequence, but now with a
multiedge between the two center vertices. There are again four
permutations of the stubs as shown, but now they correspond to only two
different matchings: close inspection reveals that the first and
fourth matchings are the same, as are the second and third.
Thus in this case there are only two matchings for this graph.
If all matchings are generated with equal probability, as in our
model, then the top graph will be generated twice as often as
the bottom one.

Unfortunately, there is a complication: if there are 
multiedges in the graph then the argument breaks down. 
Figure 3 shows why. If we identically permute the in-stubs
at one end of a multiedge and the out-stubs at the
other end, then we do not generate a new matching — we 
get back the same matching we started with. We see 
this effect in the lower half of the figure, where the four 
distinct permutations of stubs generate only two distinct 
matchings. (The top half of the figure shows another 
graph with the same degree sequence but no multiedges 
and in this case each permutation generates a unique 
matching.) 

The net result is that our previous calculation overcounts the number
of matchings per topology by a factor of the number of permutations of
edges within multiedges. If there are no multiedges, then our previous
calculation is correct. If there are multiedges then the number of
matchings is reduced by a factor of $\prod_{i<j} A_{ij}!$, where
$A_{ij}$ is an element of the adjacency matrix, i.e., the number of
edges between vertices $i$ and $j$. Since this factor depends on the
number and multiplicity of the multiedges, it follows that in general
all topologies are not sampled with exactly equal probability in our
model.

In practice, this is not a significant problem. The same 
issue arises in the configuration model but does not re- 
duce the usefulness of that model. For the sake of preci- 
sion, however, we note that although our model samples 
matchings with equal probability, it samples topologies 
with unequal probabilities that depend on the number 
and multiplicity of multiedges. 
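The counting argument can be checked numerically on a small case. The sketch below is illustrative: the function name and the edge-list representation are ours, and the example degree sequence (in-degrees 1, 2, 0, 0 and out-degrees 0, 0, 2, 1) is an assumed reading of the four-vertex, three-edge example in Fig. 3, not something stated in the text.

```python
from collections import Counter
from math import factorial, prod


def matchings_per_topology(k_in, k_out, edges):
    """Number of stub matchings that produce one given topology.

    `edges` lists one (j, i) pair per edge from later vertex j to earlier
    vertex i; a multiedge appears as a repeated pair. The count is the
    product of k_in! k_out! over vertices divided by the product of A_ij!.
    """
    stub_permutations = prod(factorial(k) for k in k_in) * prod(
        factorial(k) for k in k_out
    )
    multiplicities = Counter(edges)  # A_ij for every connected pair
    overcount = prod(factorial(a) for a in multiplicities.values())
    return stub_permutations // overcount
```

With the assumed degree sequence, a topology with no multiedges, such as edges 3 to 1, 3 to 2, and 4 to 2, corresponds to four matchings, while a topology with a double edge from 3 to 2 corresponds to two, matching the two-to-one sampling ratio described for Fig. 3.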

C. Computer generation of networks

FIG. 4: The probability that an edge (shown in bold) leaving
vertex $i$ does not connect to vertex $i+1$ is given by
$\lambda_{i+1}/\mu_{i+1}$.

One attractive feature of the model proposed here is that it is
straightforward to generate networks drawn from the model's ensemble on
a computer. Previous methods for generating directed acyclic graphs have
relied on Monte Carlo techniques [16], [17], [33] but these methods,
while versatile, are quite slow. Our model, by contrast, allows us to
generate networks rapidly, in time $O(m)$, where $m$ again is the total
number of edges in the network. The algorithm, described briefly in
[30], is based on the scheme outlined in Section III A for building a
network. Starting with $n$ vertices and an appropriate number of stubs
at each, we go through the vertices in order from 1 to $n$. For each
vertex we randomly join its outgoing stubs to ingoing ones at lower
vertices chosen uniformly from the set of all such in-stubs that are
currently unused. When all stubs have been matched in this fashion, the
network is complete and the algorithm ends.

It is straightforward to see that indeed this algorithm generates all
matchings with equal probability. Consider the step of the algorithm at
which out-stubs from vertex $i$ are matched to suitable in-stubs. The
number of out-stubs is $k_i^{\mathrm{out}}$ and the number of in-stubs
available to match them to is, by definition, equal to the flux $\mu_i$.
Thus the number of different matchings of stubs on this $i$th step is
$N_i = \mu_i!/(\mu_i - k_i^{\mathrm{out}})! = \mu_i!/\lambda_i!$, where
we have used Eq. (9) in the second equality, and the algorithm chooses
between these uniformly at random so that each one occurs with equal
probability $1/N_i$. Repeating the process for all $n$ vertices
generates a unique matching of the entire graph with probability

$$\prod_{i=1}^{n} \frac{1}{N_i} = \prod_{i=1}^{n} \frac{\lambda_i!}{\mu_i!}. \tag{10}$$

This probability is clearly uniform over all possible matchings since it
depends only on the degree sequence and not on any details of the
matching itself.

The algorithm can be implemented efficiently by maintaining in an
ordinary array a list of currently unclaimed in-stubs from which we
choose at random on every step. As soon as it is chosen, each stub is
erased from the list by moving the list's last item into its place. The
operations for each stub can be performed in time $O(1)$, and hence the
total running time is simply proportional to the total number of
in-stubs, which is $m$.
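The generation procedure can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `generate_dag` is ours, vertices are 0-indexed, and each unclaimed in-stub is stored simply as the index of the vertex it belongs to.

```python
import random


def generate_dag(k_in, k_out, rng=None):
    """Draw one matching uniformly at random; return edges as (j, i) with j > i.

    Vertices are 0-indexed. `free` is the array of unclaimed in-stubs, each
    recorded as the index of its vertex.
    """
    rng = rng or random.Random()
    free = []
    edges = []
    for j, (kin, kout) in enumerate(zip(k_in, k_out)):
        if kout > len(free):
            raise ValueError("degree sequence is not graphical")
        for _ in range(kout):
            pos = rng.randrange(len(free))  # uniform choice among unused in-stubs
            i = free[pos]
            free[pos] = free[-1]            # erase by moving the last item into place
            free.pop()
            edges.append((j, i))
        free.extend([j] * kin)              # this vertex's in-stubs serve later vertices
    if free:
        raise ValueError("degree sequence is not graphical")
    return edges
```

Each out-stub costs one random draw and one constant-time removal, so the loop does work proportional to the number of edges plus the number of vertices.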



D. Expected number of edges 

One of the most fundamental properties of our model is the expected
number of directed edges between any two vertices $i$ and $j$. We will
denote this quantity $P_{ij}$. In the limit of large network size
$P_{ij}$ becomes small and is equal to the probability that there will
be an edge between $i$ and $j$. We assume that $i < j$ in the following
calculations, so that the edge in question always runs from $j$ to $i$.

Consider Fig. 4 and consider one of the ingoing edges at vertex $i$.
That edge forms part of the flux $\mu_{i+1}$ immediately above $i$ and
of that flux $k_{i+1}^{\mathrm{out}}$ edges, chosen uniformly at random,
originate at vertex $i+1$ while the remaining
$\mu_{i+1} - k_{i+1}^{\mathrm{out}} = \lambda_{i+1}$ flow around $i+1$,
forming the excess flux at $i+1$. The probability that our particular
edge is one of the ones flowing around $i+1$, i.e., that it does not
originate at vertex $i+1$, is thus simply $\lambda_{i+1}/\mu_{i+1}$.

If our edge is to originate at vertex $j$, it must flow in this way
around every intervening vertex from $i+1$ all the way up to $j-1$, and
then finally it must originate at vertex $j$, which it does with
probability $k_j^{\mathrm{out}}/\mu_j$. Multiplying the probabilities
together, we find that the total probability of this particular edge
originating at vertex $j$ is

$$\frac{k_j^{\mathrm{out}}}{\mu_j} \prod_{l=i+1}^{j-1} \frac{\lambda_l}{\mu_l}. \tag{11}$$

This is just for one of the ingoing edges at vertex $i$. There are
$k_i^{\mathrm{in}}$ such edges in all, so the total expected number of
edges from $j$ to $i$ is

$$P_{ij} = k_i^{\mathrm{in}} k_j^{\mathrm{out}} \frac{\prod_{l=i+1}^{j-1} \lambda_l}{\prod_{l=i+1}^{j} \mu_l}. \tag{12}$$



We will find it convenient to write this expression in the form

$$P_{ij} = \frac{k_i^{\mathrm{in}} k_j^{\mathrm{out}}}{m} f_{ij}, \tag{13}$$

where

$$f_{ij} = m \frac{\prod_{l=i+1}^{j-1} \lambda_l}{\prod_{l=i+1}^{j} \mu_l}. \tag{14}$$

The quantity $k_i^{\mathrm{in}} k_j^{\mathrm{out}}/m$ is the expected
number of edges between $i$ and $j$ in an ordinary (not acyclic)
directed random graph with the same degree sequence, so $f_{ij}$
represents the factor by which that expected number is modified in the
acyclic graph. Alternatively, $f_{ij}$ is $m$ times the probability that
a single in-stub at vertex $i$ is connected to a single out-stub at
vertex $j$. (The probability itself vanishes in the limit of large graph
size but with the inclusion of the factor of $m$ we get a quantity that
tends to a nonzero limit, which will be useful when we come to consider
properties of the graph as $n \to \infty$.)
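A direct evaluation of Eq. (12) makes the structure of the product concrete. The sketch below is illustrative: the function name is ours, vertices are 1-indexed as in the text, exact rational arithmetic is used so that results can be checked by hand, and the convention 0/0 = 1 is applied to any ratio whose numerator and denominator both vanish.

```python
from fractions import Fraction


def expected_edges(k_in, k_out, i, j):
    """Expected number of edges from vertex j to vertex i (1-indexed, i < j), Eq. (12)."""
    n = len(k_in)
    if not 1 <= i < j <= n:
        raise ValueError("need 1 <= i < j <= n")
    # Flux mu[l] and excess flux lam[l] for l = 1..n (index 0 unused).
    mu, lam = [0] * (n + 1), [0] * (n + 1)
    free = 0
    for l in range(1, n + 1):
        mu[l] = free
        lam[l] = free - k_out[l - 1]
        free += k_in[l - 1] - k_out[l - 1]

    def ratio(num, den):
        # Convention 0/0 = 1 for cuts with zero flux.
        return Fraction(1) if num == 0 and den == 0 else Fraction(num, den)

    p = k_in[i - 1] * ratio(k_out[j - 1], mu[j])  # edge originates at j
    for l in range(i + 1, j):
        p *= ratio(lam[l], mu[l])                 # edge flows around vertex l
    return p
```

For the hypothetical sequence with in-degrees (1, 2, 0, 0) and out-degrees (0, 0, 2, 1), the four nonzero values are 2/3, 4/3, 1/3, and 2/3, and they sum to the correct in- and out-degrees at each vertex. Each call costs time proportional to $j - i$.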

One complication in the expression for $f_{ij}$ occurs if any flux in
the denominator is zero. The expression gives the correct answer of zero
for $P_{ij}$ if we adopt the convention that $0/0 = 1$. However, it is
usually better to analyze a graph divided by a zero flux cut as two
independent graphs, since no edges cross the cut in such a network and
the network forms two separate components. A network with zero excess
flux does not necessarily form two separate components (the two parts of
the network can be joined by a single common vertex at the top of one
part and the bottom of the other) but the two parts can be treated
independently anyway, with the shared vertex, if any, participating in
both parts. Hence, in the following, we assume that $\mu_i \ne 0$ and
$\lambda_i \ne 0$ except for $i = 1$ and $i = n$.

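Another useful expression for $f_{ij}$ can be derived by multiplying
both sides of Eq. (14) by $f_{i'j'}$ with the condition that $i$ and
$i'$ are both less than $j$ and $j'$. Then

$$f_{ij} f_{i'j'}
= m^2 \frac{\prod_{l=i+1}^{j-1} \lambda_l}{\prod_{l=i+1}^{j} \mu_l}\,
      \frac{\prod_{l=i'+1}^{j'-1} \lambda_l}{\prod_{l=i'+1}^{j'} \mu_l}
= m^2 \frac{\prod_{l=i+1}^{j'-1} \lambda_l}{\prod_{l=i+1}^{j'} \mu_l}\,
      \frac{\prod_{l=i'+1}^{j-1} \lambda_l}{\prod_{l=i'+1}^{j} \mu_l}
= f_{ij'} f_{i'j}. \tag{15}$$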

Thus we can freely swap indices on a product of two overlapping $f$s.
In particular, if we set $i' = 1$ and $j' = n$, we find that

$$f_{ij} = \frac{f_{in} f_{1j}}{f_{1n}}, \tag{16}$$

and $f_{ij}$ thus factors into a product of independent functions of $i$
and $j$. This result is of some practical use, since it implies that in
order to calculate $f_{ij}$ or $P_{ij}$ for any $i$ and $j$ we need only
the quantities $f_{in}$ and $f_{1j}$, which are $O(n)$ in number and
take $O(n)$ time to calculate. Once these are known, we can calculate
any $P_{ij}$ in $O(1)$ time, which is as fast as the corresponding
calculation for the configuration model, and far faster than direct
application of Eq. (12), which takes $O(n)$ time on average for each
$P_{ij}$.

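Perhaps the simplest way to implement this idea in practice is to
define the two "dimensionless" quantities

$$a_i = \frac{f_{in}}{f_{1n}}, \qquad b_j = \frac{f_{1j}}{f_{1n}}, \tag{17}$$

so that

$$f_{ij} = f_{1n} a_i b_j. \tag{18}$$

Clearly $a_1 = b_n = 1$ and, substituting from Eq. (14) into Eq. (17),
we find the values for other $i, j$ to be

$$a_i = \frac{\prod_{l=2}^{i} \mu_l}{\prod_{l=2}^{i} \lambda_l}
      = \prod_{l=2}^{i} \left( 1 + \frac{k_l^{\mathrm{out}}}{\lambda_l} \right), \tag{19a}$$

$$b_j = \frac{\prod_{l=j+1}^{n} \mu_l}{\prod_{l=j}^{n-1} \lambda_l}
      = \prod_{l=j}^{n-1} \left( 1 + \frac{k_l^{\mathrm{in}}}{\lambda_l} \right), \tag{19b}$$

where we have made use of Eq. (9) [18]. We will use these expressions in
a number of calculations in the following sections.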
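The factorization can be turned into two linear-time tables followed by constant-time lookups. The sketch below is illustrative: the function names are ours, vertices are 1-indexed, and it assumes, as in the text, that the fluxes and excess fluxes are nonzero away from the first and last vertices. It builds $a_i$ and $b_j$ from the recursions $a_i = a_{i-1}\mu_i/\lambda_i$ and $b_j = b_{j+1}\mu_{j+1}/\lambda_j$, and uses the identity $f_{1n} = m/(\mu_n a_{n-1})$, which follows from Eq. (14).

```python
from fractions import Fraction


def factor_tables(k_in, k_out):
    """Return (f1n, a, b) with f_ij = f1n * a[i] * b[j] for 1 <= i < j <= n.

    a[1..n-1] and b[2..n] are filled in; other entries are None. Assumes the
    excess flux is nonzero for 1 < l < n and the flux is nonzero for l > 1.
    """
    n, m = len(k_in), sum(k_in)
    mu, lam = [0] * (n + 1), [0] * (n + 1)  # index 0 unused
    free = 0
    for l in range(1, n + 1):
        mu[l] = free
        lam[l] = free - k_out[l - 1]
        free += k_in[l - 1] - k_out[l - 1]
    a = [None] * (n + 1)
    b = [None] * (n + 1)
    a[1] = Fraction(1)
    for i in range(2, n):
        a[i] = a[i - 1] * Fraction(mu[i], lam[i])
    b[n] = Fraction(1)
    for j in range(n - 1, 1, -1):
        b[j] = b[j + 1] * Fraction(mu[j + 1], lam[j])
    f1n = Fraction(m, mu[n]) / a[n - 1]
    return f1n, a, b


def expected_edges_fast(k_in, k_out, tables, i, j):
    """P_ij from Eqs. (13) and (18) in constant time, given precomputed tables."""
    f1n, a, b = tables
    return Fraction(k_in[i - 1] * k_out[j - 1], sum(k_in)) * f1n * a[i] * b[j]
```

For the hypothetical sequence with in-degrees (2, 1, 1, 0, 0) and out-degrees (0, 1, 1, 1, 1), this gives $f_{1n} = 1/2$, $a = (1, 2, 4, 8)$, and $b = (4, 2, 1, 1)$, and hence $P_{12} = 1$, as it must be since the single out-stub of vertex 2 can attach only to vertex 1.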



E. Assortativity 

As an example of the application of the calculations
in the previous section, consider vertex correlations or
"assortativity" in acyclic networks [33].

Consider a quantity x defined on all vertices i of a 
network. The network is said to be assortative with re- 
spect to x if edges tend to connect vertices with simi- 
lar values of x, high with high and low with low. Con- 
versely, if edges connect dissimilar values, high with low 
and vice versa, then the network is said to be disassor- 
tative. Assortativity can be quantified by calculating a 
standard Pearson correlation coefficient r over all pairs of 
values $x_i, x_j$ on vertices $i, j$ connected by an edge. Positive
values of $r$ indicate assortative networks, negative
values disassortative ones. 

In a directed network, such as the acyclic networks considered here,
more complex types of correlations are also possible. For instance, one
can consider two different quantities, $x$ and $y$, each defined on all
vertices, and then ask about the correlations between pairs of values
$x_i, y_j$ on vertices $i, j$ connected by a directed edge from $j$ to
$i$. (The simpler example above with only one quantity $x$ can be
considered as the special case in which $y = x$.) Again one can
calculate a correlation coefficient that quantifies the level of
assortativity or disassortativity. The correlation coefficient is given
explicitly in terms of the standard adjacency matrix by

$$r = \frac{\frac{1}{m} \sum_{ij} A_{ij} x_i y_j - \mu_{\mathrm{in}} \mu_{\mathrm{out}}}{\sigma_x \sigma_y}, \tag{20}$$

where

$$\mu_{\mathrm{in}} = \frac{1}{m} \sum_i k_i^{\mathrm{in}} x_i, \qquad
  \mu_{\mathrm{out}} = \frac{1}{m} \sum_j k_j^{\mathrm{out}} y_j, \tag{21}$$

and

$$\sigma_x^2 = \frac{1}{m} \sum_i k_i^{\mathrm{in}} x_i^2 - \mu_{\mathrm{in}}^2, \tag{22a}$$

$$\sigma_y^2 = \frac{1}{m} \sum_j k_j^{\mathrm{out}} y_j^2 - \mu_{\mathrm{out}}^2. \tag{22b}$$

Conventional random graph models such as the configuration model show no
assortativity with respect to any quantity $x$, but random acyclic
graphs can have nonzero assortativity. Consider Eq. (20) for the acyclic
case and notice that the only dependence on $A_{ij}$ is in the first
term of the numerator. All the other terms depend only on the degree
sequence of the network, and hence are constant for our acyclic graph
model over all members of the model ensemble. Averaging over the
ensemble and noting that the model average of $A_{ij}$ is simply
$P_{ij}$ from Eq. (13), we find that within our model

$$r = \frac{1}{\sigma_x \sigma_y} \left[ \frac{1}{m} \sum_{i<j} P_{ij} x_i y_j - \mu_{\mathrm{in}} \mu_{\mathrm{out}} \right]
    = \frac{1}{\sigma_x \sigma_y} \left[ \frac{f_{1n}}{m^2} \sum_{i<j} k_i^{\mathrm{in}} a_i x_i \, k_j^{\mathrm{out}} b_j y_j - \mu_{\mathrm{in}} \mu_{\mathrm{out}} \right], \tag{23}$$

where we have used Eq. (18). In general, this expression can give
nonzero values of $r$. We will see some examples in Section V for the
particular case of assortativity with respect to vertex degree [13],
[36], such as the case in which $x_i = k_i^{\mathrm{in}}$ and
$y_j = k_j^{\mathrm{out}}$.
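The ensemble-average correlation coefficient can be evaluated in time linear in $n$ by accumulating the sum over $i < j$ with a running prefix total. The sketch below is illustrative and floating-point: the function name is ours, inputs are 0-indexed lists while the internal tables follow the 1-indexed convention of the text, and it assumes the same nonzero-flux conditions as before as well as nonzero variances of $x$ and $y$ over the edge ends.

```python
from math import sqrt


def expected_assortativity(k_in, k_out, x, y):
    """Ensemble-average r of Eq. (23), using f_ij = f1n * a_i * b_j."""
    n, m = len(k_in), sum(k_in)
    mu, lam = [0] * (n + 1), [0] * (n + 1)  # 1-indexed flux and excess flux
    free = 0
    for l in range(1, n + 1):
        mu[l] = free
        lam[l] = free - k_out[l - 1]
        free += k_in[l - 1] - k_out[l - 1]
    # a[n] and b[1] are never needed (k_n^in = k_1^out = 0), so they stay 0.
    a, b = [0.0] * (n + 1), [0.0] * (n + 1)
    a[1] = 1.0
    for i in range(2, n):
        a[i] = a[i - 1] * mu[i] / lam[i]
    b[n] = 1.0
    for j in range(n - 1, 1, -1):
        b[j] = b[j + 1] * mu[j + 1] / lam[j]
    f1n = m / (mu[n] * a[n - 1])
    # Sum over i < j of (k_i^in a_i x_i)(k_j^out b_j y_j) with a prefix total.
    total, prefix = 0.0, 0.0
    for j in range(1, n + 1):
        total += prefix * k_out[j - 1] * b[j] * y[j - 1]
        prefix += k_in[j - 1] * a[j] * x[j - 1]
    mean_xy = f1n * total / m**2
    mu_in = sum(k * v for k, v in zip(k_in, x)) / m
    mu_out = sum(k * v for k, v in zip(k_out, y)) / m
    var_x = sum(k * v * v for k, v in zip(k_in, x)) / m - mu_in**2
    var_y = sum(k * v * v for k, v in zip(k_out, y)) / m - mu_out**2
    return (mean_xy - mu_in * mu_out) / sqrt(var_x * var_y)
```

As a hand-checkable case, take the hypothetical sequence with in-degrees (2, 1, 1, 0, 0) and out-degrees (0, 1, 1, 1, 1) and let both $x$ and $y$ be the vertex index: the first term of the numerator is 6.6875, the product of means is 6.125, and the variances are 0.6875 and 1.25, giving a positive $r$ of roughly 0.61.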



F. Large system-size limit 

The developments so far are for a network of finite 
size with a specified degree sequence. Like other random 
graph models, however, random acyclic graphs become 
significantly simpler in a number of ways in the limit of 
large graph size. We examine that limit in this section. 

Let the number of vertices in our network be $n$ as previously. In the
limit of large $n$ we can no longer specify the complete degree
sequence, since there are an infinite number of vertices, so, as with
other random graphs, we specify instead a degree distribution, which is
a joint probability distribution over in- and out-degrees as a function
of vertex order. We define a "time" variable $t = i/n$ for the $i$th
vertex, which falls in the range $0 \le t \le 1$, then let
$p_t(k^{\mathrm{in}}, k^{\mathrm{out}})$ be the probability that a
vertex at time $t$ has in- and out-degrees $k^{\mathrm{in}}$ and
$k^{\mathrm{out}}$. Since vertices are uniformly distributed in time,
this distribution is related to the overall (joint) degree distribution
of the network by a simple integral:

$$p(k^{\mathrm{in}}, k^{\mathrm{out}}) = \int_0^1 p_t(k^{\mathrm{in}}, k^{\mathrm{out}}) \, dt. \tag{24}$$



Unfortunately the full distribution
$p_t(k^{\mathrm{in}}, k^{\mathrm{out}})$ is usually impossible to
measure for an observed network: measuring it would require us to build
a double histogram of $k^{\mathrm{in}}$ and $k^{\mathrm{out}}$ for many
small intervals of $t$ and none of the real-world networks we have
examined are large enough to give acceptable statistics for such a
histogram. Luckily, however, it turns out that many interesting
characteristics of the network can be calculated with a knowledge only
of the moments of the degree distribution, and in most cases only the
first moment, i.e., the mean degree.



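The mean in- and out-degrees at time $t$ are given by

\[
  k^{\mathrm{in}}(t) = \sum_{k^{\mathrm{in}}=0}^{\infty} \sum_{k^{\mathrm{out}}=0}^{\infty} k^{\mathrm{in}} \, p_t(k^{\mathrm{in}}, k^{\mathrm{out}}), \qquad
  k^{\mathrm{out}}(t) = \sum_{k^{\mathrm{in}}=0}^{\infty} \sum_{k^{\mathrm{out}}=0}^{\infty} k^{\mathrm{out}} \, p_t(k^{\mathrm{in}}, k^{\mathrm{out}}), \tag{25}
\]

and the overall average degree $c$ of the network is

\[
  c = \int_0^1 k^{\mathrm{in}}(t) \, dt = \int_0^1 k^{\mathrm{out}}(t) \, dt. \tag{26}
\]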



Both $k^{\mathrm{in}}(t)$ and $k^{\mathrm{out}}(t)$ are easily measured in practice (at
least approximately) by performing running averages of 
the observed degrees over suitably chosen time intervals. 
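As a rough illustration of such a measurement, the sketch below averages an observed degree sequence, listed in vertex order, over consecutive non-overlapping windows. The function name `running_mean_degree`, the window size, and the toy degree sequence are assumptions made for this example; a sliding (overlapping) window would serve equally well.

```python
def running_mean_degree(degrees, window):
    """Average `degrees` (listed in vertex order) over consecutive blocks.

    Returns a list of (t_mid, mean_degree) pairs, where t_mid is the
    midpoint of the block on the rescaled "time" axis t = i/n.
    """
    n = len(degrees)
    profile = []
    for start in range(0, n, window):
        block = degrees[start:start + window]
        t_mid = (start + len(block) / 2) / n
        profile.append((t_mid, sum(block) / len(block)))
    return profile


# Toy in-degree sequence (hypothetical data): early vertices are cited more.
k_in = [4, 3, 3, 2, 2, 1, 1, 0]
profile = running_mean_degree(k_in, window=4)
print(profile)
```

The window must be wide enough to average out vertex-to-vertex fluctuations but narrow enough that the mean degree does not change much across it.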

For many of the calculations presented here we will use 
the rescaled quantities 



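\[
  \kappa^{\mathrm{in}}(t) = \frac{k^{\mathrm{in}}(t)}{c}, \qquad
  \kappa^{\mathrm{out}}(t) = \frac{k^{\mathrm{out}}(t)}{c}, \tag{27}
\]

which satisfy the normalization conditions

\[
  \int_0^1 \kappa^{\mathrm{in}}(t) \, dt = \int_0^1 \kappa^{\mathrm{out}}(t) \, dt = 1. \tag{28}
\]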



The quantity $\kappa^{\mathrm{in}}(t) \, dt$ is the fraction of all in-stubs that
are attached to vertices in the range $t$ to $t + dt$, and
similarly for $\kappa^{\mathrm{out}}(t) \, dt$. The numbers of stubs are given
by $m \kappa^{\mathrm{in}}(t) \, dt$ and $m \kappa^{\mathrm{out}}(t) \, dt$, since $m$ is the total number
of stubs of each kind in the whole network.

The flux below vertex i in the network is given by 
integrating these quantities up to a given vertex thus: 



\[
  \mu_i = m \int_0^t \left[ \kappa^{\mathrm{in}}(t') - \kappa^{\mathrm{out}}(t') \right] dt', \tag{29}
\]



where t = i/n as before. Note that, assuming the degree 
distribution remains constant as the network becomes 
large, the integral for given t also remains constant, but 
m = nc grows with network size. Thus the flux becomes 
arbitrarily large as $n \to \infty$. For our purposes it is better
to use a quantity that remains constant as $n$ varies and
so we define a rescaled flux

\[
  \mu(t) = \frac{\mu_i}{m} = \int_0^t \left[ \kappa^{\mathrm{in}}(t') - \kappa^{\mathrm{out}}(t') \right] dt'. \tag{30}
\]



In the large system size limit, there is no difference between
the flux $\mu_i$ and the excess flux $\lambda_i$: the two differ
only by the number of stubs at a single vertex, which is a
vanishing fraction of $m$ in the limit of large network size,
and hence $\lambda_i$ also varies as $m$ and the rescaled excess flux
$\lambda(t) = \lambda_i/m$ is given by

\[
  \lambda(t) = \int_0^t \left[ \kappa^{\mathrm{in}}(t') - \kappa^{\mathrm{out}}(t') \right] dt'. \tag{31}
\]



Physically $\mu(t)$ and $\lambda(t)$ are both equal to the fraction of
edges that run from vertices after $t$ to vertices before.
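For a finite network the rescaled flux can be estimated directly from the degree sequence by replacing the integral with a running sum. The following is a minimal sketch under that assumption; the function name `rescaled_excess_flux` and the toy degree sequences are illustrative and not taken from the paper.

```python
from itertools import accumulate


def rescaled_excess_flux(k_in, k_out):
    """Discrete estimate of lambda(t) at t = i/n for i = 1..n.

    Uses the running sum of (in-degree - out-degree) over vertices up to i,
    divided by the number of edges m. Whether vertex i itself is included
    in the sum makes no difference in the large-n limit.
    """
    m = sum(k_in)
    if m != sum(k_out):
        raise ValueError("numbers of in-stubs and out-stubs must match")
    running = accumulate(a - b for a, b in zip(k_in, k_out))
    return [x / m for x in running]


# Toy sequences (hypothetical): early vertices receive edges, late ones send them.
k_in = [2, 1, 1, 0]
k_out = [0, 1, 1, 2]
lam = rescaled_excess_flux(k_in, k_out)
print(lam)
```

The final entry is zero because every in-stub has been matched by the time the last vertex is reached, and no entry may be negative if the degree sequence is to be realizable as an acyclic graph.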
Applying these definitions, we can now calculate a variety of
quantities in the $n \to \infty$ limit. To calculate the
probability of connection between two vertices, we start
with Eq. (15a):

\[
  a_i = \prod_{l=2}^{i} \left( 1 + \frac{k_l^{\mathrm{out}}}{\lambda_l} \right)
      = \exp\left[ \sum_{l=2}^{i} \ln\left( 1 + \frac{k_l^{\mathrm{out}}}{\lambda_l} \right) \right]. \tag{32}
\]

Observing, as above, that $\lambda_l$ goes as $m$ in the large system
size limit while $k_l^{\mathrm{out}}$ remains constant and keeping terms
to leading order, this becomes

\[
  a_i = \exp\left[ \sum_{l=2}^{i} \frac{k_l^{\mathrm{out}}}{\lambda_l} \right]. \tag{33}
\]



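And in the limit of large $n$, the sum becomes an integral:

\[
  a(t) = \exp\left[ \int_0^t \frac{\kappa^{\mathrm{out}}(t')}{\lambda(t')} \, dt' \right]. \tag{34}
\]

Similarly, defining $u = j/n$, Eq. (15b) becomes

\[
  b(u) = \exp\left[ \int_u^1 \frac{\kappa^{\mathrm{in}}(u')}{\lambda(u')} \, du' \right], \tag{35}
\]

and substituting both into Eq. (14) we get

\[
  f(t,u) = f(0,1) \, a(t) \, b(u), \tag{36}
\]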



where $f_{ij} = f(i/n, j/n)$. Physically, $f(t,u)$ is $m$ times
the probability that an in-stub at time $t$ is connected to
an out-stub at time $u$. The normalizing constant $f(0,1)$
can be calculated by noting that every in-stub must be 
connected to some out-stub, which means that 



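\[
  \int_t^1 f(t,u) \, \kappa^{\mathrm{out}}(u) \, du = 1. \tag{37}
\]

Substituting for $f(t,u)$ from Eq. (36) and setting $t = 0$
then gives

\[
  f(0,1) = \left[ \int_0^1 b(u) \, \kappa^{\mathrm{out}}(u) \, du \right]^{-1}, \tag{38}
\]

where we have made use of $a(0) = 1$. If we instead normalize
by integrating over $t$ we get the alternative form

\[
  f(0,1) = \left[ \int_0^1 a(t) \, \kappa^{\mathrm{in}}(t) \, dt \right]^{-1}, \tag{39}
\]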



which gives the same answer but may be more convenient 
in some cases, depending on the forms of $\kappa^{\mathrm{in}}$ and $\kappa^{\mathrm{out}}$.

Armed with a value for $f(t,u)$ we can now calculate
the expected number of edges between two vertices in
the network from Eq. (13):

\[
  P_{ij} = \frac{k_i^{\mathrm{in}} k_j^{\mathrm{out}}}{m} \, f(i/n, j/n). \tag{40}
\]

Alternatively, we can average this expression over the
distributions of $k^{\mathrm{in}}$ and $k^{\mathrm{out}}$ to get the average number
of edges between a vertex at $t$ and another at $u$:

\[
  P(t,u) = \frac{k^{\mathrm{in}}(t) \, k^{\mathrm{out}}(u)}{m} \, f(t,u)
         = \frac{c}{n} \, \kappa^{\mathrm{in}}(t) \, \kappa^{\mathrm{out}}(u) \, f(t,u). \tag{41}
\]

Since $f(t,u)$ is independent of $n$ for given $\kappa^{\mathrm{in}}(t)$
and $\kappa^{\mathrm{out}}(u)$, $P_{ij}$ [and $P(t,u)$] goes as $1/n$ in a sparse
graph as graph size becomes large and hence vanishes in
the limit. This allows us to interpret $P_{ij}$ as a probability
of connection between vertices in the $n \to \infty$ limit: the
expected number of edges and the probability of connection
are the same when both become small.

We also note in passing the following useful relation 
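between $\lambda(t)$ and $f(t,u)$. From Eq. (12) we have

\[
  P_{i-1,i} = \frac{k_{i-1}^{\mathrm{in}} \, k_i^{\mathrm{out}}}{\mu_i}, \tag{42}
\]

so that $f_{i-1,i} = m/\mu_i$. Setting $t = i/n$ as before and
$\mu_i/m \to \lambda(t)$, this implies that

\[
  \lambda(t) = \frac{1}{f(t,t)}. \tag{43}
\]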



G. Examples 

To illustrate the application of these results let us look 
at some concrete examples. Consider a network with 
average degrees $k^{\mathrm{in}}(t) = 2c(1 - t)$ and $k^{\mathrm{out}}(t) = 2ct$,
where $c$ is now a free parameter controlling the overall
mean degree. Then

\[
  \kappa^{\mathrm{in}}(t) = 2(1 - t), \qquad \kappa^{\mathrm{out}}(u) = 2u, \tag{44}
\]

and we find that

\[
  f(t,u) = \frac{1}{2(1 - t) u} \tag{45}
\]

and

\[
  P(t,u) = \frac{2c(1 - t) \times 2cu}{2m(1 - t) u} = \frac{2c}{n}, \tag{46}
\]

where we have used $m = nc$ in the second equality.
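The closed form for $f(t,u)$ in this example can be checked numerically. The sketch below assumes that $a(t)$ is the exponential of the integral of $\kappa^{\mathrm{out}}/\lambda$ from 0 to $t$, that $b(u)$ is the exponential of the integral of $\kappa^{\mathrm{in}}/\lambda$ from $u$ to 1, and that $f(0,1)$ is the reciprocal of the integral of $b(u)\kappa^{\mathrm{out}}(u)$ over the unit interval, which is how we read Eqs. (34), (35) and (38). The midpoint-rule helper `integrate` and the step counts are choices made for this illustration only.

```python
import math


def integrate(g, lo, hi, steps=2000):
    """Midpoint rule; g is never evaluated at the end points themselves."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (s + 0.5) * h) for s in range(steps))


def kappa_in(t):
    return 2.0 * (1.0 - t)


def kappa_out(u):
    return 2.0 * u


def lam(t):
    # Integral of kappa_in - kappa_out from 0 to t, done by hand for this example.
    return 2.0 * t * (1.0 - t)


def a(t):
    return math.exp(integrate(lambda x: kappa_out(x) / lam(x), 0.0, t))


def b(u):
    return math.exp(integrate(lambda x: kappa_in(x) / lam(x), u, 1.0))


# Normalizing constant: every in-stub at t = 0 must connect to some out-stub.
f01 = 1.0 / integrate(lambda u: b(u) * kappa_out(u), 0.0, 1.0, steps=400)


def f(t, u):
    return f01 * a(t) * b(u)


t, u = 0.3, 0.7
print(f01, f(t, u), 1.0 / (2.0 * (1.0 - t) * u))
```

Within the accuracy of the quadrature the numerical value agrees with $1/[2(1-t)u]$, and $1/f(t,t)$ reproduces $\lambda(t) = 2t(1-t)$.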

Thus the expected number of edges between every pair 
of vertices in this case is the same, and indeed one could 
exploit this fact to create a network with the degree se- 
quence above by taking an initially empty graph and 
placing a directed edge between each vertex pair with uniform
probability $2c/n$, oriented to point from the "later"
vertex to the "earlier" one. Such a model has been studied
previously as a model of food webs, in which context
it is known as the cascade model [37]. It's easy to see
that the cascade model produces networks with a given
degree sequence uniformly at random and thus is approximately
equivalent to an acyclic random graph with the
same degree sequence as described in this paper. The
equivalence is only approximate: the cascade model has 
a Bernoulli distribution of edges between any two ver- 
tices while our model has a Poisson distribution. This 
difference, however, vanishes in the limit of large graph 
size, where the edge probability becomes small, and thus 
in this limit the two models are the same. 
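The cascade construction described above is simple enough to sketch directly. The code below is an illustration only: the function name `cascade_graph`, the seed, and the parameter values are assumptions, and vertices are numbered from 0 rather than 1.

```python
import random


def cascade_graph(n, c, rng):
    """Place an edge j -> i, for each pair i < j, independently with probability 2c/n."""
    p = 2.0 * c / n
    edges = []
    for j in range(n):
        for i in range(j):
            if rng.random() < p:
                edges.append((j, i))  # points from the later vertex to the earlier one
    return edges


rng = random.Random(1)  # fixed seed so that runs are repeatable
n, c = 400, 3.0
edges = cascade_graph(n, c, rng)
mean_degree = len(edges) / n

# In-degrees should fall off with vertex order, roughly as 2c(1 - t).
in_degree = [0] * n
for _, i in edges:
    in_degree[i] += 1
early = sum(in_degree[: n // 2])
late = sum(in_degree[n // 2:])
print(mean_degree, early, late)
```

Because every edge runs from a higher-numbered vertex to a lower-numbered one, the result is acyclic by construction, and the measured mean degree should lie close to $c$.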

More generally, consider a model where a Poisson distributed
number of directed edges is placed between all
pairs of vertices $i, j$ with $i < j$. If the mean of the Poisson
distribution for each vertex pair can be written as a
product of a quantity $r_i$ that depends on $i$ but not on $j$
and a quantity $s_j$ that depends on $j$ but not on $i$, then
the model produces acyclic random graphs conditioned
on the degree sequence. To prove this we write the probability
$P$ of generating a particular graph thus:

\[
  P = \prod_{i<j} \frac{(r_i s_j)^{A_{ij}}}{A_{ij}!} \, e^{-r_i s_j}
    = \prod_{i<j} e^{-r_i s_j} \times \prod_i r_i^{k_i^{\mathrm{in}}} s_i^{k_i^{\mathrm{out}}} \times \prod_{i<j} \frac{1}{A_{ij}!}. \tag{47}
\]

The factor $\prod_{i<j} e^{-r_i s_j}$ is a constant for all graphs and
the factor $\prod_i s_i^{k_i^{\mathrm{out}}} r_i^{k_i^{\mathrm{in}}}$ is constant for a given degree
sequence. Thus the only variation in the probability $P$
for graphs of given degree sequence comes from the factor
$\prod_{i<j} 1/A_{ij}!$. But this is the same factor by which the
probability of such graphs varies in the random acyclic
graph model (see Section III B) and thus, for a given
degree sequence, the model above produces graphs with
the same probabilities as the random acyclic graph and
the two models have identical ensembles. The cascade
model is a particularly simple instance of this situation
in which $r_i$ and $s_j$ are both constant.

As another example, we consider networks with power-law
degree distributions, which have received a lot of
attention in the recent networks literature. In particular,
for reasons that will shortly become clear, we consider
networks generated by linear preferential attachment
processes [38], which naturally generate directed
acyclic graphs and have long been used as models of
citation networks [3]. We consider the general model
in which vertices added continually to a growing network
make $c$ directed connections each to previously existing
vertices chosen at random in proportion to the
current in-degrees of those vertices plus a constant $r$.
This process produces networks with overall in-degree
distributions having a power-law tail $p(k) \sim k^{-\alpha}$ where
$\alpha = 2 + r/c$ [39, 40]. In the notation used in this paper
the average in-degree as a function of time is given
by 0: 



\[
  \kappa^{\mathrm{in}}(t) = (\alpha - 2) \left( t^{-1/(\alpha - 1)} - 1 \right), \tag{48}
\]

and $\kappa^{\mathrm{out}}(u) = 1$.
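The growth process just described can be simulated in a few lines. The sketch below is a direct, unoptimized illustration: the function name `preferential_attachment`, the seed, and the parameter values are assumptions, vertices are numbered from 0, and the in-degree is updated after every single edge so that later edges from the same new vertex see the updated weights.

```python
import random


def preferential_attachment(n, c, r, rng):
    """Grow a directed acyclic graph by linear preferential attachment.

    Each new vertex j >= 1 sends c edges to earlier vertices i < j, chosen
    with probability proportional to (current in-degree of i) + r.
    Multiple edges between the same pair are allowed.
    """
    k_in = [0] * n
    edges = []
    for j in range(1, n):
        for _ in range(c):
            weights = [k_in[i] + r for i in range(j)]
            i = rng.choices(range(j), weights=weights)[0]
            k_in[i] += 1
            edges.append((j, i))
    return edges, k_in


edges, k_in = preferential_attachment(200, 3, 1.5, random.Random(7))
first_half = sum(k_in[:100])
second_half = sum(k_in[100:])
print(len(edges), first_half, second_half)
```

With $c = 3$ and $r = 1.5$ the exponent is $\alpha = 2.5$, and most in-stubs end up on early vertices, in line with the divergence of $\kappa^{\mathrm{in}}(t)$ as $t \to 0$.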

Let us consider a random directed acyclic graph built 
on degree sequences generated by the linear preferential 
attachment model and let us calculate the probability 
of connection between vertices. Feeding the expressions 
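above for $\kappa^{\mathrm{in}}(t)$ and $\kappa^{\mathrm{out}}(u)$ into our earlier formulas, we
find that

\[
  f(t,u) = \frac{1}{(\alpha - 1) \left( 1 - t^{1/(\alpha - 1)} \right) u^{(\alpha - 2)/(\alpha - 1)}} \tag{49}
\]

and

\[
  P(t,u) = \frac{c(\alpha - 2)}{n(\alpha - 1)} \, t^{-1/(\alpha - 1)} \, u^{-(\alpha - 2)/(\alpha - 1)}, \tag{50}
\]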

where again t = i/n and u = j/n. Remarkably, this 
is precisely the average probability of an edge between 
vertices in the preferential attachment model itself [41].
Indeed, as we will shortly show, the linear preferential 
attachment ensemble and the ensemble of the random 
acyclic graph with the same degree sequence are actually 
identical, because linear preferential attachment, condi- 
tioned on the degree sequence, produces matchings uni- 
formly at random, which is precisely the condition for the 
random acyclic graph. Thus, not only is P(t, u) the same 
for the two models, but all properties of the models are 
identical and one can properly say that the linear prefer- 
ential attachment model is a special case of the random 
directed acyclic graph. 

This is an important point. It is often claimed that 
networks produced by the linear preferential attachment 
process are, in some sense, not really random, being 
nonuniform in their ensemble properties because they are 
grown according to a nonequilibrium growth process. In
fact, however, this is not the case. Once the acyclic na- 
ture of the networks is taken into account, the ensemble 
of the linear preferential attachment model is perfectly 
uniform for a given degree sequence. 

To prove this we compute the probability of a particu- 
lar matching being produced by the linear preferential at- 
tachment model as a function of in-degree sequence. An 
outgoing edge at a newly added vertex j in the growing 
preferential attachment network attaches to a previous 
vertex $i$ with probability proportional to $i$'s current
in-degree $k_i^{\mathrm{in}}$ plus the constant $r$. The correctly normalized
probability of attachment is

\[
  \frac{k_i^{\mathrm{in}} + r}{m + (j - 1) r}, \tag{51}
\]

where $m = \sum_i k_i^{\mathrm{in}}$ is the current number of edges in the
network. The probability of the entire matching is given 
by the product of this expression over all edges. Let us 
consider the numerator and denominator of the product 
separately, starting with the numerator. 

The current in-degree of vertex $i$ is 0 when the first
edge attaches to it, 1 when the second edge attaches, and
so forth. Hence the factors for vertex $i$ in the numerator
are

\[
  r(1 + r) \cdots (k_i^{\mathrm{in}} - 1 + r) = \frac{\Gamma(k_i^{\mathrm{in}} + r)}{\Gamma(r)}, \tag{52}
\]

where $k_i^{\mathrm{in}}$ now represents the final in-degree of $i$ at the
end of the growth process and $\Gamma(x)$ is the standard
gamma function. Taking the product over all vertices,
the complete numerator is $\prod_{i=1}^{n-1} \Gamma(k_i^{\mathrm{in}} + r)/\Gamma(r)$. (There
is no term for the last vertex since it necessarily has no
ingoing edges.)

For the denominator, we note that the number of
edges $m$ in the network increases by one for each edge
added and takes the value $(j - 2)c$ for the first edge added
with vertex $j$ and $(j - 1)c - 1$ for the last. Thus the factors
in the denominator corresponding to the edges added
with vertex $j$ give

\[
  [(j - 2)c + (j - 1)r] \cdots [(j - 1)c - 1 + (j - 1)r]
  = \frac{\Gamma((j - 1)(c + r))}{\Gamma((j - 1)(c + r) - c)}, \tag{53}
\]

and the complete denominator is

\[
  \prod_{j=2}^{n} \frac{\Gamma((j - 1)(c + r))}{\Gamma((j - 1)(c + r) - c)}
  = \prod_{i=1}^{n-1} \frac{\Gamma(i(c + r))}{\Gamma(i(c + r) - c)}. \tag{54}
\]
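The bookkeeping in Eqs. (51) to (54) can be verified on a small example. The sketch below multiplies the attachment probabilities edge by edge and compares the result with the ratio of gamma-function products from Eqs. (52) and (54). The function names, the two hand-built matchings, and the parameter values are assumptions for illustration; vertices are numbered from 0 in the code, so vertex `j` has `j` predecessors.

```python
import math


def matching_probability(targets, r):
    """Product of the attachment probabilities of Eq. (51) over all edges.

    targets[j] lists, in order, the earlier vertices chosen by vertex j
    (0-based, all entries < j); targets[0] must be empty.
    """
    n = len(targets)
    k_in = [0] * n
    m = 0
    prob = 1.0
    for j in range(1, n):
        for i in targets[j]:
            prob *= (k_in[i] + r) / (m + j * r)  # j earlier vertices exist
            k_in[i] += 1
            m += 1
    return prob, k_in


def closed_form(k_in, c, r):
    """Numerator of Eq. (52), taken over vertices, divided by Eq. (54)."""
    n = len(k_in)
    log_p = 0.0
    for i in range(1, n):  # i = 1 .. n-1 in the paper's 1-based numbering
        log_p += math.lgamma(k_in[i - 1] + r) - math.lgamma(r)
        log_p += math.lgamma(i * (c + r) - c) - math.lgamma(i * (c + r))
    return math.exp(log_p)


c, r = 2, 1.0
# Two different matchings with the same in-degree sequence [3, 2, 1, 0].
matching_a = [[], [0, 0], [0, 1], [1, 2]]
matching_b = [[], [0, 0], [1, 1], [0, 2]]
p_a, k_a = matching_probability(matching_a, r)
p_b, k_b = matching_probability(matching_b, r)
print(p_a, p_b, closed_form(k_a, c, r))
```

Both matchings have probability $6/1120$, and the gamma-function expression gives the same value, as expected if the probability depends on the connections only through the in-degree sequence.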



Dividing numerator by denominator, the complete
probability for the matching is then

\[
  \prod_{i=1}^{n-1} \frac{\Gamma(k_i^{\mathrm{in}} + r)}{\Gamma(r)} \,
  \frac{\Gamma(i(c + r) - c)}{\Gamma(i(c + r))}. \tag{55}
\]

Since this probability depends only on the degree sequence
and not on any details of which vertices attach
to which others, it follows that the preferential attachment
process generates all matchings with a given degree
sequence with the same probability, and hence that the
set of networks with that degree sequence constitutes a
random directed acyclic graph of the type considered in
this paper.

Note that a calculation similar to the one above can
be performed for a model in which out-degree is not the
same for every vertex, but varies from one vertex to another,
or a network in which the parameter $r$ varies between
vertices. The probability of a particular matching
for such a model is still a function only of the degrees and
other parameters and not of the pattern of connections
in the network and hence the network is still a random
graph of the type considered here.



IV. RANDOM DIRECTED ACYCLIC GRAPHS
WITH INDEPENDENT EDGE PROBABILITIES

In this section we define the second of our two random
graph models for acyclic graphs. In this model
rather than fixing the degree of each vertex we fix only
the expected degree. As discussed in the introduction
the model is in some ways analogous to the $G(n,p)$
model of Erdős and Rényi [19] for ordinary (Poisson) random
graphs, while the previous model is the equivalent
of $G(n,m)$.

We have seen that it is possible in our previous model
to calculate the probability of an edge between any pair
of vertices. However, in that model edges are not independent
because the presence of one edge connecting to
a given vertex $i$ reduces the number of stubs available for
other edges and hence reduces the probability of edges
from other vertices. In the limit of large network size,
the probabilities for edges to and from intervals $dt$ and
$du$ become independent, but even in this limit edges that
share the same exact vertex, either as source or target,
remain correlated.

The same phenomenon is also seen in other random
graph models, such as the configuration model, in which
degrees are also fixed and the presence of one edge to a
vertex reduces the probability of others. In that case,
researchers have found it useful to study a slightly different
model in which edges are placed with the same
probability as in the configuration model, but independently
[13, 5H, Hi| . The same strategy turns out also to
work well in the case of acyclic graphs. The resulting
model is described in this section.



A. Definition of the model



Our second model is defined as follows: starting with
an empty graph of $n$ vertices we generate, for each pair
of vertices $i, j$ with $i < j$, a Poisson distributed number
with mean $P_{ij}$ and place that number of edges between
$i$ and $j$, pointing from $j$ to $i$. The values of $P_{ij}$ are
typically calculated from a desired degree sequence using
Eq. (13), and the resulting network trivially has the same
expected number of edges between every vertex pair as 
the network generated by our first model with the same 
degree sequence, but the edges are now, by construction, 
independent. 

Since the number of edges between every vertex pair 
is Poisson distributed, so also is the total number of 
edges m. Thus an equivalent way to create networks 
drawn from this model is to generate a Poisson dis- 
tributed random number m with mean equal to the de- 
sired expected number of edges, then distribute those 
edges at random over the graph in proportion to $P_{ij}$.
This second method for generating networks is a more 
efficient one for numerical work but the first is more con- 
venient for analytic treatment of the model. 

The principal disadvantage of this model is that it
does not allow us to fix the exact degrees of each vertex.
Instead we can only fix the expected degrees $k_i^{\mathrm{in}}$
and $k_i^{\mathrm{out}}$. The expected in-degree, for instance, is given
by $\sum_{j=i+1}^{n} P_{ij}$, which is by definition equal to the value
of $k_i^{\mathrm{in}}$ used to calculate $P_{ij}$ in the first place. In other
words, the network has expected degrees equal to the
chosen degree sequence, but the actual degrees may be
different.

In fact, since the numbers of edges are independent
Poisson variables, the in-degree will also be Poisson distributed
with mean $k_i^{\mathrm{in}}$ (and similarly for the out-degree).
Note however that this does not mean that the overall
distribution of the degrees at any time has to be Poisson,
since the distribution from which the means themselves
are drawn can be anything we like and the overall distribution
of degrees is a convolution of this distribution
and the Poisson distribution.

The expected degrees also need not be integers, so this
model allows a slight generalization of the previous one in
that the values of $k_i^{\mathrm{in}}$ and $k_j^{\mathrm{out}}$ we use to calculate $P_{ij}$ need
not be integers. Indeed we could generalize the model
considerably further, since in principle we can choose the
values of the $P_{ij}$ to be anything we want, including values
that cannot be generated from Eq. (13) by any choice
of degrees. Any values, for example, that do not take
the product form of Eq. (13) fall in this category. In
this paper, however, we will mostly be concerned with
choices of $P_{ij}$ that correspond to an underlying choice of
expected degrees.



B. Computer generation of networks 

It is less straightforward to numerically generate net- 
works drawn from the ensemble of our second model than 
of our first. The basic approach is as outlined above: 
given the expected degrees, we calculate the expected
number of edges by summing $m = \sum_{i=1}^{n} k_i^{\mathrm{in}}$ and then
generate a Poisson distributed number with this mean,
which will be the actual number of edges $m$.

To place these $m$ edges with the appropriate probabilities
we need to be able to randomly generate vertex
pairs with probabilities proportional to $P_{ij}$. This can
conveniently be achieved by making use of the product
form (13) of $P_{ij}$. We draw a value for $i$ from the marginal
probability distribution, which goes as $\sum_{j=i+1}^{n} P_{ij} = k_i^{\mathrm{in}}$,
using a standard transformation method, which takes
$O(\log n)$ time. Then we draw a value for $j$ between $i + 1$
and $n$ in proportion to $k_j^{\mathrm{out}} b_j$, again using the transformation
method. Then we place an edge between $i$ and $j$
and repeat for the next edge. When all $m$ edges have
been placed the graph is complete. The whole process
takes $O(n)$ time for set-up and $O(m \log n)$ for selection
and placing of edges, or $O(n + m \log n)$ time in total,
which is $O(n \log n)$ on a graph with fixed degree distribution
so that $m \propto n$.
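The procedure above can be sketched as follows for any edge weights of product form, $P_{ij} \propto x_i y_j$ for $i < j$. Here `x` and `y` are hypothetical names for the $i$-dependent and $j$-dependent factors (in the text these would be built from the expected degrees and the factors of Eq. (13)), the cumulative tables are searched by bisection as one way of realizing the transformation method, and the small `poisson` helper is a simple stand-in because the Python standard library has no Poisson sampler.

```python
import bisect
import itertools
import random


def poisson(mean, rng):
    """Poisson variate: count unit-rate exponential arrivals before `mean`."""
    count, total = 0, rng.expovariate(1.0)
    while total < mean:
        count += 1
        total += rng.expovariate(1.0)
    return count


def sample_edges(x, y, mean_edges, rng):
    """Draw a Poisson number of edges j -> i (i < j) with weight x[i] * y[j]."""
    n = len(x)
    cum_y = list(itertools.accumulate(y))             # y[0] + ... + y[j]
    tail = [cum_y[-1] - cum_y[i] for i in range(n)]   # sum of y[j] over j > i
    cum_w = list(itertools.accumulate(xi * ti for xi, ti in zip(x, tail)))
    edges = []
    for _ in range(poisson(mean_edges, rng)):
        # Invert the cumulative distributions by binary search, O(log n) each.
        i = bisect.bisect_right(cum_w, rng.random() * cum_w[-1])
        i = min(i, n - 2)                             # guard against rounding
        j = bisect.bisect_right(cum_y, cum_y[i] + rng.random() * tail[i])
        j = min(j, n - 1)                             # guard against rounding
        edges.append((j, i))
    return edges


# Toy weights (hypothetical): five vertices, all pairs equally likely.
x = [1.0] * 5
y = [1.0] * 5
edges = sample_edges(x, y, 200.0, random.Random(3))
print(len(edges), edges[:5])
```

Building the cumulative tables takes $O(n)$ time and each edge costs two binary searches, which matches the $O(n + m \log n)$ total quoted above.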



V. COMPARISON WITH EMPIRICAL DATA 

Our expressions for edge probabilities allow us to make 
a comparison between our model networks and their 
counterparts in the real world. We focus on citation net- 
works, which are the largest and best documented exam- 
ples of acyclic networks. 

The simplest comparison we could make would be a
direct comparison of edge probabilities $P_{ij}$. However,
the value of $P_{ij}$ is strongly influenced by the degrees of
vertices (the initial factor of $k_i^{\mathrm{in}} k_j^{\mathrm{out}}$ in Eq. (13)), which
makes comparison plots noisy and difficult to interpret by
eye. A cleaner comparison is of the stub probability $f_{ij}$,
Eq. (14), which is $m$ times the probability that a stub at
vertex $i$ is connected to a stub at vertex $j$.

We can make an estimate of $f_{ij}$ for an observed network
by taking a window of vertices around $i$ and another
around $j$, counting the number of edges between vertices
in the two windows, and then dividing in turn by the
number of in-stubs in the first window and out-stubs in
the second and multiplying by $m$ [45]. If the windows are
large enough to provide good statistics but small enough 
to span only a relatively narrow range of i and j then 
one can get good estimates of the mean stub probability 
this way. 
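The windowed estimate described above can be written down compactly. The sketch below is illustrative only: the function name `estimate_f`, the symmetric window of `half_width` vertices on each side, and the four-vertex toy network are assumptions, and vertices are numbered from 0 in their time order.

```python
def estimate_f(edges, k_in, k_out, i, j, half_width):
    """Windowed estimate of the stub probability f_ij.

    edges is a list of (source, target) pairs with target < source.
    Counts edges from the window around j to the window around i, divides by
    the in-stubs in the first window and the out-stubs in the second, and
    multiplies by the total number of edges m.
    """
    n = len(k_in)
    m = len(edges)
    win_i = range(max(0, i - half_width), min(n, i + half_width + 1))
    win_j = range(max(0, j - half_width), min(n, j + half_width + 1))
    count = sum(1 for src, dst in edges if dst in win_i and src in win_j)
    in_stubs = sum(k_in[v] for v in win_i)
    out_stubs = sum(k_out[v] for v in win_j)
    if in_stubs == 0 or out_stubs == 0:
        return float("nan")  # no stubs in a window, so no estimate is possible
    return m * count / (in_stubs * out_stubs)


# Toy network (hypothetical data) with four vertices and four edges.
edges = [(1, 0), (2, 0), (3, 1), (3, 2)]
k_in = [2, 1, 1, 0]
k_out = [0, 1, 1, 2]
value = estimate_f(edges, k_in, k_out, i=0, j=3, half_width=1)
print(value)
```

In the toy example two of the four edges run between the windows, which contain three in-stubs and three out-stubs, giving $4 \times 2 / 9$.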

In Fig. 5 we show the results of such measurements
for two citation networks. The first is a network of citations
between academic papers in the area of theoretical
high-energy physics, which we studied previously in
Ref. [30]. This data set comprises 27 221 papers posted
in the "hep-th" section of the Physics E-print Archive at 
arxiv.org between January 1992 and February 2003. The 
data set was compiled by the organizers of the KDD Cup 
challenge, a data analysis competition run as part of the 
annual ACM SIGKDD conference, and incorporates cita- 
tions extracted from data held in the SPIRES database 
at the Stanford Linear Accelerator Center. 

The second data set is a network of citations between 
26 084 legal decisions handed down by the United States 
Supreme Court, from the time of the court's inception in 
1789 until 2006, as compiled by Leicht et al. @. 

From these data we extracted values for $f_{ij}$ as described
and also calculated the full in- and out-degree
sequences and used them to evaluate the analytic
expression (14) for the same quantity.

Figure 5 shows separately the value of $f_{ij}$ for fixed
i and varying j (left panels) and for fixed j and vary- 
ing i (right panels) for the two networks. As we can see, 
in all cases the analytic solution for the random graph 
model agrees surprisingly well with the measurements. 
The agreement is not perfect — there are visible differ- 
ences between measurement and theory — but the level 
of agreement is far better than for most other random 
graph models. Certainly the predictions of the configu- 
ration model rarely agree this well with the behavior of 
real-world networks. Thus it appears that, in this case 
at least, the twin inputs of degree sequence and vertex 
order are enough to capture a large part of the variation 
in edge placement in the true citation networks. 

FIG. 5: Comparison of empirical measurements (red) and analytic predictions (black) of $f_{ij}$ for the two citation networks
described in the text: preprints on high-energy physics (top) and cases of the United States Supreme Court (bottom). The left
panel in each case shows $f_{ij}$ for citations from times $t$ to time 0.1 (indicated by dashed line). The right panel shows $f_{ij}$ for
citations to times $t$ from time 0.9. Empirical measurements were averaged over windows of size 300 vertices.

There are other aspects of network structure, however,
that are not so well captured by our model. An example
is correlations between the degrees of adjacent vertices, or
degree assortativity in the nomenclature of Section III E.
We consider two kinds of possible degree correlations over
directed edges: correlations between in- and out-degrees
at the start and end of directed edges, and correlations
between in-degrees at either end. In the language of paper
citations, the former is a measure of the extent to
which highly cited papers are cited more often by prolific
citers. The latter is a measure of the extent to which
highly cited papers are more likely to be cited by other
highly cited papers. We have computed correlation coefficients
of the form (20) for both networks described
above for both of these types of correlations, as well as
calculating expected values for random graphs with the
same degree sequences from Eq. (20).

The results show mixed levels of agreement. For the 
high-energy physics citation network the measured and 
predicted values of the correlation coefficients are in all 
cases very small, indeed negligible for most practical pur- 
poses, so that, although the empirical and theoretical 
values do not agree closely, one could claim that there 
is qualitative agreement between them in that there is 
essentially no correlation present. [For in-degree/out-degree
correlations we find $r = 0.002$ (empirical) and
$-0.003$ (theory) and for in-degree/in-degree we find
$r = 0.040$ (empirical) and 0.016 (theory).]

For the Supreme Court, on the other hand, the cor- 
relations are more substantial and moreover display sig- 
nificant disparity between observed and predicted values. 
For in-degree/out-degree correlations we find r = 0.124 
(empirical) and 0.007 (theory), and for in-degree/in- 
degree we find r = 0.184 (empirical) and 0.022 (theory). 
This appears to indicate the presence of significant phe- 
nomena in the real network that are not captured in the 
model, and illustrates one of the main motivations for the 
creation of random graph models, which is to provide a 



null model that can tell us when an observed property of 
a network differs significantly from what we would expect 
on the basis of chance, and hence draw our attention to 
nontrivial network features. 



VI. CONCLUSIONS 

In this paper we have introduced two random graph 
models for directed acyclic graphs, which are analogous 
to the G(n, m) and G(n,p) models of traditional random 
graph theory. We have defined and calculated a num- 
ber of fundamental theoretical quantities for these mod- 
els, including degree sequences, degree distributions, edge 
and stub probabilities, and degree correlations. We have 
also defined the appropriate infinite-size limit of our mod- 
els and shown that a number of the central quantities of 
the theory simplify in this limit. We have compared the 
basic predictions of the models with two example real- 
world networks, a network of citations between physics 
papers and another of legal decisions, finding surpris- 
ingly good agreement between measurement and theory 
for some properties, but significant divergence in others. 

Starting with the formalism developed in this paper it 
should be possible to compute many other standard net- 
work quantities for random directed acyclic graphs. We 
believe that the models developed here have the potential
to shed a significant amount of light on the effects of
vertex ordering, an important defining property in many 
real-world networks. 